statistical physics

Laws of thermodynamics for exponential families

Balsubramani, Akshay

arXiv.org Artificial Intelligence

Most learning problems can be solved by minimization of log loss. This bare fact is inescapable in modern AI and machine learning - the variety is in the details. What is the space of measured data? What is the support of the distribution? Changing such properties of the problem fundamentally changes learning behavior, leading to the variety of modeling approaches successfully used in data science. But for many inference and decision-making tasks, log loss can be axiomatically inescapable. We explore such loss minimization problems in the language of statistical mechanics, which studies how systems of "particles" like atoms can be approximately described by relatively few bulk properties. There is a direct analogue to modeling, where large datasets are described by relatively few model parameters.
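
The "few bulk properties" analogy can be made concrete with a one-parameter exponential family, where minimizing log loss reduces to matching the empirical mean of the sufficient statistic. A minimal sketch (not from the paper; the Bernoulli family, learning rate, and step count are illustrative assumptions):

```python
import numpy as np

# Minimal sketch: fit the natural parameter of a Bernoulli exponential family
# p(x) = exp(theta * x - A(theta)), x in {0, 1}, by gradient descent on the
# average log loss (negative log-likelihood).
# Here A(theta) = log(1 + exp(theta)), so dA/dtheta = sigmoid(theta) = E[x].

rng = np.random.default_rng(0)
data = rng.binomial(1, 0.7, size=10_000)   # a "large dataset"

theta = 0.0                                 # a single model parameter
for _ in range(500):
    grad = 1.0 / (1.0 + np.exp(-theta)) - data.mean()  # dA/dtheta - empirical mean
    theta -= 0.1 * grad                     # gradient step on the log loss

# At the optimum the model's mean matches the data's mean (moment matching):
# sigmoid(theta) ~= 0.7, so the dataset is summarized by one bulk statistic.
print(theta, 1.0 / (1.0 + np.exp(-theta)), data.mean())
```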


Machine learning and optimization-based approaches to duality in statistical physics

Ferrari, Andrea E. V., Gupta, Prateek, Iqbal, Nabil

arXiv.org Artificial Intelligence

The notion of duality -- that a given physical system can have two different mathematical descriptions -- is a key idea in modern theoretical physics. Establishing a duality in lattice statistical mechanics models requires the construction of a dual Hamiltonian and a map from the original to the dual observables. By using simple neural networks to parameterize these maps and introducing a loss function that penalises the difference between correlation functions in original and dual models, we formulate the process of duality discovery as an optimization problem. We numerically solve this problem and show that our framework can rediscover the celebrated Kramers-Wannier duality for the 2d Ising model, reconstructing the known mapping of temperatures. We also discuss an alternative approach which uses known features of the mapping of topological lines to reduce the problem to optimizing the couplings in a dual Hamiltonian, and explore next-to-nearest neighbour deformations of the 2d Ising duality. We discuss future directions and prospects for discovering new dualities within this framework.
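
For reference, the closed-form temperature map that the paper's optimization framework rediscovers is classical. A minimal sketch of that known map (not the paper's neural-network method; couplings are written as K = J / (k_B T)):

```python
import numpy as np

# Kramers-Wannier duality for the 2d Ising model:
# sinh(2K) * sinh(2K*) = 1  =>  K* = 0.5 * arcsinh(1 / sinh(2K)).
def dual_coupling(K):
    return 0.5 * np.arcsinh(1.0 / np.sinh(2.0 * K))

K_c = 0.5 * np.log(1.0 + np.sqrt(2.0))  # self-dual (critical) point
print(dual_coupling(0.2))                # high temperature maps to low temperature
print(dual_coupling(K_c), K_c)           # fixed point: dual_coupling(K_c) == K_c
```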


Reviews: Mean-field theory of graph neural networks in graph partitioning

Neural Information Processing Systems

A GNN (graph neural network) is a neural network whose input is a graph. This paper studies the problem of using a GNN to detect clusters in a graph drawn from the 2-group SBM (stochastic block model: a popular model for random graphs with community structure). Although it is already known how to optimally solve the SBM, it is of interest to analyze the performance of a GNN on the SBM. In the GNN architecture studied here, each layer has a node for each vertex in the graph.
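
A minimal sketch of the setup (illustrative only; the graph size, edge probabilities, feature width, and mean aggregation are assumptions, not the reviewed paper's exact architecture): sample a 2-group SBM and apply one GNN layer with a node per vertex.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p_in, p_out = 200, 0.10, 0.02           # assumed sizes and probabilities
labels = rng.integers(0, 2, size=n)        # planted communities
prob = np.where(labels[:, None] == labels[None, :], p_in, p_out)
A = (rng.random((n, n)) < prob).astype(float)
A = np.triu(A, 1); A = A + A.T             # undirected, no self-loops

h = rng.standard_normal((n, 8))            # initial node features
W = rng.standard_normal((8, 8)) / np.sqrt(8.0)

# One GNN layer: each vertex's node averages its neighbors' states,
# then applies a shared linear map and a nonlinearity.
deg = A.sum(1, keepdims=True).clip(min=1.0)
h = np.tanh((A @ h) / deg @ W)
print(h.shape)
```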


Reviews: The committee machine: Computational to statistical gaps in learning a two-layers neural network

Neural Information Processing Systems

The committee machine is a simple and natural model for a two-layer neural network. (The results of the paper also apply to many related models: an arbitrary function mapping the K hidden values to the final binary output is allowed.) This paper studies the problem of learning the weights W under a natural random model. We are given m random examples (X, Y), where the input X (in R^n) is i.i.d. Gaussian and Y (in {+1, -1}) is the associated output of the network. The unknown weights W are drawn i.i.d. from a known prior.
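
A minimal sketch of this data model (the dimensions and the majority-vote output rule are illustrative assumptions; the paper allows an arbitrary function of the K hidden values):

```python
import numpy as np

rng = np.random.default_rng(0)
n, K, m = 100, 3, 1_000                    # input dim, hidden units, examples

W = rng.standard_normal((K, n))            # unknown weights, i.i.d. prior
X = rng.standard_normal((m, n))            # i.i.d. Gaussian inputs
hidden = np.sign(X @ W.T)                  # K hidden values per example
Y = np.sign(hidden.sum(axis=1))            # majority vote: labels in {+1, -1}
                                           # (K odd, so the sum is never zero)
print(X.shape, Y[:10])
```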


Dynamic neurons: A statistical physics approach for analyzing deep neural networks

Lee, Donghee, Lee, Hye-Sung, Yi, Jaeok

arXiv.org Artificial Intelligence

Deep neural network architectures often consist of repetitive structural elements. We introduce a new approach that reveals these patterns and can be broadly applied to the study of deep learning. Similar to how a power strip helps untangle and organize complex cable connections, this approach treats neurons as additional degrees of freedom in interactions, simplifying the structure and enhancing the intuitive understanding of interactions within deep neural networks. Furthermore, it reveals the translational symmetry of deep neural networks, which simplifies the application of the renormalization group transformation - a method that effectively analyzes the scaling behavior of the system. By utilizing translational symmetry and renormalization group transformations, we can analyze critical phenomena. This approach may open new avenues for studying deep neural networks using statistical physics.
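
To fix intuition for the renormalization group transformation invoked here, a textbook sketch (the 1d Ising decimation map, a standard example and not the paper's construction):

```python
import numpy as np

# Real-space RG for the 1d Ising chain: decimating every other spin maps the
# coupling K to K' = 0.5 * ln(cosh(2K)). Iterating the map traces the flow
# toward the trivial K = 0 fixed point.
def rg_step(K):
    return 0.5 * np.log(np.cosh(2.0 * K))

K = 1.0
for step in range(6):
    print(step, K)
    K = rg_step(K)   # one decimation = one rescaling of the system
```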